DeepCCI: End-to-end Deep Learning for Chemical-Chemical Interaction Prediction
Chemical-chemical interaction (CCI) plays a key role in predicting candidate
drugs, toxicity, therapeutic effects, and biological functions. In various
types of chemical analyses, computational approaches are often required due to
the amount of data that needs to be handled. The recent remarkable growth and
outstanding performance of deep learning have attracted considerable research
attention. However, even in state-of-the-art drug analysis methods, deep
learning continues to be used only as a classifier, although deep learning is
capable of not only simple classification but also automated feature
extraction. In this paper, we propose the first end-to-end learning method for
CCI, named DeepCCI. Hidden features are derived from a simplified molecular
input line entry system (SMILES), which is a string notation representing the
chemical structure, instead of learning from crafted features. To discover
hidden representations for the SMILES strings, we use convolutional neural
networks (CNNs). To guarantee the commutative property for homogeneous
interaction, we apply model sharing and hidden representation merging
techniques. The performance of DeepCCI was compared with a plain deep
classifier and conventional machine learning methods. The proposed DeepCCI
showed the best performance in all seven evaluation metrics used. In addition,
the commutative property was experimentally validated. The features automatically extracted through end-to-end SMILES learning alleviate the significant effort required for manual feature engineering and are expected to improve prediction performance in drug analyses.
Comment: ACM-BCB 2017
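The commutative constraint f(A, B) = f(B, A) can be enforced by encoding both SMILES strings with a single shared CNN and merging the two hidden representations with a symmetric operation. Below is a minimal PyTorch-style sketch of that model-sharing and merging idea; the character vocabulary size, layer dimensions, and sum-merge are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SharedSmilesEncoder(nn.Module):
    """One CNN shared by both inputs (model sharing)."""
    def __init__(self, vocab_size=64, embed_dim=32, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, hidden_dim, kernel_size=5, padding=2)
        self.pool = nn.AdaptiveMaxPool1d(1)

    def forward(self, smiles_ids):                      # (batch, max_len) integer-encoded SMILES
        x = self.embed(smiles_ids).transpose(1, 2)      # (batch, embed_dim, max_len)
        h = torch.relu(self.conv(x))
        return self.pool(h).squeeze(-1)                 # (batch, hidden_dim)

class CCIModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = SharedSmilesEncoder()
        self.classifier = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, smiles_a, smiles_b):
        h_a = self.encoder(smiles_a)                    # same encoder for both chemicals
        h_b = self.encoder(smiles_b)
        merged = h_a + h_b                              # symmetric merge => f(A, B) == f(B, A)
        return torch.sigmoid(self.classifier(merged))
```

Because the merge is an elementwise sum, swapping the two inputs produces identical outputs by construction, which is the commutative property the abstract refers to.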
Will solid-state drives accelerate your bioinformatics? In-depth profiling, performance analysis, and beyond
A wide variety of large-scale data has been produced in bioinformatics. In
response, the need for efficient handling of biomedical big data has been
partly met by parallel computing. However, the time demand of many
bioinformatics programs still remains high for large-scale practical uses due
to factors that hinder acceleration by parallelization. Recently, new
generations of storage devices have emerged, such as NAND flash-based
solid-state drives (SSDs), and with the renewed interest in near-data processing, they are increasingly being adopted as acceleration options that can accompany parallel processing. In certain cases, a simple drop-in replacement
of hard disk drives (HDDs) by SSDs results in dramatic speedup. Despite the
various advantages and continuous cost reduction of SSDs, there has been little
review of SSD-based profiling and performance exploration of important but
time-consuming bioinformatics programs. For an informative review, we perform
in-depth profiling and analysis of 23 key bioinformatics programs using
multiple types of devices. Based on the insights we obtain from this research, we further discuss how to design and optimize bioinformatics algorithms and pipelines to fully exploit SSDs. The programs we profile cover
traditional and emerging areas of importance, such as alignment, assembly,
mapping, expression analysis, variant calling, and metagenomics. We explain how
acceleration by parallelization can be combined with SSDs for improved
performance and also how using SSDs can expedite important bioinformatics
pipelines, such as variant calling by the Genome Analysis Toolkit (GATK) and
transcriptome analysis using RNA sequencing (RNA-seq). We hope that this review
can provide useful directions and tips to accompany future bioinformatics
algorithm design procedures that properly consider new generations of powerful
storage devices.
Comment: Availability: http://best.snu.ac.kr/pub/biossd; to be published in Briefings in Bioinformatics
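A simple way to reproduce the kind of drop-in comparison described above is to run the same program with its inputs staged on different storage devices and record the wall-clock time. The sketch below uses hypothetical mount points and an example aligner command (bwa mem); it is not the profiling harness used in the review.

```python
import shutil
import subprocess
import time
from pathlib import Path

def time_on_device(cmd_template, input_file, work_dir):
    """Copy the input to work_dir (an HDD or SSD mount) and time the command there."""
    staged = Path(work_dir) / Path(input_file).name
    shutil.copy(input_file, staged)
    start = time.perf_counter()
    subprocess.run(cmd_template.format(input=staged), shell=True, check=True)
    return time.perf_counter() - start

# Hypothetical example: same aligner, same data, HDD vs SSD staging.
for device, mount in [("HDD", "/mnt/hdd/scratch"), ("SSD", "/mnt/ssd/scratch")]:
    elapsed = time_on_device("bwa mem ref.fa {input} > /dev/null", "reads.fq", mount)
    print(f"{device}: {elapsed:.1f} s")
```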
Manifold Regularized Deep Neural Networks using Adversarial Examples
Learning meaningful representations using deep neural networks involves
designing efficient training schemes and well-structured networks. Currently, stochastic gradient descent with momentum, combined with dropout, is one of the most popular training protocols. Based on that, more advanced
methods (i.e., Maxout and Batch Normalization) have been proposed in recent
years, but most still suffer from performance degradation caused by small
perturbations, also known as adversarial examples. To address this issue, we
propose manifold regularized networks (MRnet) that utilize a novel training
objective function that minimizes the difference between the multi-layer embeddings of samples and those of their adversarial counterparts. Our experimental results demonstrated
that MRnet is more resilient to adversarial examples and helps us to generalize
representations on manifolds. Furthermore, combining MRnet and dropout allowed
us to achieve competitive classification performances for three well-known
benchmarks: MNIST, CIFAR-10, and SVHN.
Comment: Figures 2, 5, and 7 and several descriptions revised
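The core training signal is a penalty on the distance between a sample's hidden embedding and that of its adversarially perturbed copy. The PyTorch sketch below is a minimal illustration of that idea, not the authors' implementation: an FGSM-style perturbation stands in for however MRnet generates adversarial examples, and the network, perturbation size, and penalty weight are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU())
        self.out = nn.Linear(256, 10)

    def forward(self, x, return_embedding=False):
        h = self.hidden(x)
        return (self.out(h), h) if return_embedding else self.out(h)

def manifold_regularized_loss(model, x, y, epsilon=0.05, lam=1.0):
    # Build an FGSM-style adversarial copy of the batch.
    x_adv = x.clone().detach().requires_grad_(True)
    ce = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(ce, x_adv)[0]
    x_adv = (x + epsilon * grad.sign()).detach()

    logits, h_clean = model(x, return_embedding=True)
    _, h_adv = model(x_adv, return_embedding=True)

    # Classification loss + penalty tying clean and adversarial embeddings together.
    return F.cross_entropy(logits, y) + lam * F.mse_loss(h_clean, h_adv)
```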
Building a Neural Machine Translation System Using Only Synthetic Parallel Data
Recent works have shown that synthetic parallel data automatically generated
by translation models can be effective for various neural machine translation
(NMT) issues. In this study, we build NMT systems using only synthetic parallel
data. As an efficient alternative to real parallel data, we also present a new
type of synthetic parallel corpus. The proposed pseudo parallel data are distinct from those of previous works in that ground truth and synthetic examples are mixed on both sides of the sentence pairs. Experiments on Czech-German and
French-German translations demonstrate the efficacy of the proposed pseudo
parallel corpus, which shows not only enhanced results for bidirectional
translation tasks but also substantial improvement with the aid of a ground
truth real parallel corpus.
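One way to picture a corpus in which real and synthetic text appear on both sides is the generic mixing sketch below: monolingual source sentences are paired with machine translations, and vice versa. The translate functions are hypothetical stand-ins for pretrained NMT models, and this is not the paper's exact construction procedure.

```python
import random

def build_pseudo_parallel(mono_src, mono_tgt, src2tgt, tgt2src):
    """Mix ground-truth and synthetic text on both sides of the sentence pairs.

    src2tgt / tgt2src: assumed translation functions (e.g., pretrained NMT models).
    """
    corpus = []
    for s in mono_src:                     # real source, synthetic target
        corpus.append((s, src2tgt(s)))
    for t in mono_tgt:                     # synthetic source, real target
        corpus.append((tgt2src(t), t))
    random.shuffle(corpus)                 # both sides now contain a mix of real and synthetic text
    return corpus
```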
Homomorphic Parameter Compression for Distributed Deep Learning Training
Distributed training of deep neural networks has received significant
research interest, and its major approaches include implementations on multiple
GPUs and clusters. Parallelization can dramatically improve the efficiency of
training deep and complicated models with large-scale data. A fundamental
barrier to speeding up DNN training, however, is the trade-off between
computation and communication time. In other words, increasing the number of
worker nodes decreases the time consumed in computation while simultaneously
increasing communication overhead under constrained network bandwidth,
especially in commodity hardware environments. To alleviate this trade-off, we
suggest the idea of homomorphic parameter compression, which compresses
parameters with the least expense and trains the DNN with the compressed
representation. Although the specific method is yet to be discovered, we demonstrate that the homomorphism is highly likely to reduce the communication overhead, thanks to negligible compression and decompression times. We also provide a theoretical analysis of the speedup achievable with homomorphic compression.
Comment: 8 pages, 7 figures
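The intended saving comes from exchanging parameters in a compressed form and operating on that form directly, so workers avoid repeated compress/decompress cycles. Since the paper leaves the concrete scheme open, the sketch below uses simple 8-bit quantization as a stand-in representation in which averaging is performed directly on the compressed integers; it only illustrates the communication-saving idea.

```python
import numpy as np

def quantize(params, scale=127.0):
    """Map float32 parameters in [-1, 1] to int8 (a stand-in compressed form)."""
    return np.clip(np.round(params * scale), -127, 127).astype(np.int8)

def dequantize(q, scale=127.0):
    return q.astype(np.float32) / scale

def aggregate_compressed(worker_updates):
    """Average worker updates while they are still in the compressed (int8) domain."""
    stacked = np.stack([q.astype(np.int16) for q in worker_updates])  # avoid int8 overflow
    return np.round(stacked.mean(axis=0)).astype(np.int8)

# Toy example: three workers send int8 updates; the server averages without decompressing.
updates = [quantize(np.random.uniform(-1, 1, size=1000).astype(np.float32)) for _ in range(3)]
merged = dequantize(aggregate_compressed(updates))
```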
HexaGAN: Generative Adversarial Nets for Real World Classification
Most deep learning classification studies assume clean data. However, when dealing with real-world data, we encounter three problems: 1) missing data, 2) class imbalance, and 3) missing labels. These problems
undermine the performance of a classifier. Various preprocessing techniques
have been proposed to mitigate one of these problems, but an algorithm that
assumes and resolves all three problems together has not been proposed yet. In
this paper, we propose HexaGAN, a generative adversarial network framework that
shows promising classification performance for all three problems. We interpret
the three problems from a single perspective to solve them jointly. To enable
this, the framework consists of six components, which interact with each other.
We also devise novel loss functions corresponding to the architecture. The
designed loss functions allow us to achieve state-of-the-art imputation
performance, with up to a 14% improvement, and to generate high-quality
class-conditional data. We evaluate the classification performance (F1-score)
of the proposed method with 20% missingness and confirm up to a 5% improvement
in comparison with the performance of combinations of state-of-the-art methods.
Comment: Accepted to ICML 2019
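For the missing-data part of the problem, the general recipe of adversarial imputation looks like the sketch below: a generator proposes values for missing entries given the observed ones and a mask, and an element-wise discriminator tries to tell which entries were imputed. This is only that generic recipe, not HexaGAN's six-component architecture or its loss functions.

```python
import torch
import torch.nn as nn

class Imputer(nn.Module):
    """Generator: propose values for missing entries given observed data and the mask."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim * 2, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x, mask):                        # mask = 1 where observed, 0 where missing
        x_obs = x * mask
        filled = self.net(torch.cat([x_obs, mask], dim=1))
        return x_obs + (1 - mask) * filled             # keep observed values, impute the rest

class ElementwiseDiscriminator(nn.Module):
    """Per-feature probability that an entry is observed rather than imputed."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, x_hat):
        return self.net(x_hat)
```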
Patch SVDD: Patch-level SVDD for Anomaly Detection and Segmentation
In this paper, we address the problem of image anomaly detection and
segmentation. Anomaly detection involves making a binary decision as to whether
an input image contains an anomaly, and anomaly segmentation aims to locate the
anomaly at the pixel level. Support vector data description (SVDD) is a long-standing algorithm used for anomaly detection, and we extend its deep learning variant to a patch-based method using self-supervised learning. This
extension enables anomaly segmentation and improves detection performance. As a
result, anomaly detection and segmentation performance, measured in AUROC on the MVTec AD dataset, increased by 9.8% and 7.0%, respectively, compared to the
previous state-of-the-art methods. Our results indicate the efficacy of the
proposed method and its potential for industrial application. Detailed analysis
of the proposed method offers insights regarding its behavior, and the code is
available online.
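At inference time, a patch-based scheme of this kind scores each patch by the distance between its learned feature and the features of normal patches, which yields both an image-level score and a pixel-level anomaly map. The sketch below shows only that scoring step; the encoder, patch size, and nearest-neighbor scoring are illustrative assumptions, and training with the self-supervised objective is omitted.

```python
import torch

def extract_patches(images, patch_size=32, stride=16):
    """Slice images (N, C, H, W) into overlapping patches."""
    patches = images.unfold(2, patch_size, stride).unfold(3, patch_size, stride)
    n, c, ph, pw, _, _ = patches.shape
    return patches.permute(0, 2, 3, 1, 4, 5).reshape(-1, c, patch_size, patch_size)

def anomaly_scores(encoder, normal_images, test_images):
    """Score test patches by distance to the nearest normal-patch feature."""
    with torch.no_grad():
        normal_feats = encoder(extract_patches(normal_images))   # (num_normal_patches, D)
        test_feats = encoder(extract_patches(test_images))       # (num_test_patches, D)
    dists = torch.cdist(test_feats, normal_feats)                # pairwise L2 distances
    return dists.min(dim=1).values                               # nearest-neighbor distance per patch
```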
One-Shot Learning for Text-to-SQL Generation
Most deep learning approaches for text-to-SQL generation are limited to the
WikiSQL dataset, which only supports very simple queries. Recently,
template-based and sequence-to-sequence approaches were proposed to support
complex queries, which contain join queries, nested queries, and other types.
However, Finegan-Dollak et al. (2018) demonstrated that both approaches lack the ability to generate SQL for unseen templates. In this paper, we propose a template-based one-shot learning model for text-to-SQL generation so that the model can generate SQL for an untrained template based on a single example.
First, we classify the SQL template using a Matching Network augmented by our novel Candidate Search Network architecture. Then, we fill the variable slots in the predicted template using a Pointer Network. We show
that our model outperforms state-of-the-art approaches for various text-to-SQL
datasets in two aspects: 1) the SQL generation accuracy for the trained
templates, and 2) the adaptability to the unseen SQL templates based on a
single example without any additional training.
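The template-classification step can be pictured as nearest-neighbor matching in an embedding space: the question is embedded, compared against the embeddings of one support example per SQL template, and assigned the template with the highest similarity, after which the slots are filled by a pointer mechanism. The sketch below covers only the matching step; the encoder and cosine-similarity scoring are illustrative assumptions, and the Candidate Search Network and Pointer Network are omitted.

```python
import torch
import torch.nn.functional as F

def match_template(encoder, question, support_questions, support_templates):
    """Pick the SQL template whose support example is closest to the input question."""
    with torch.no_grad():
        q = encoder(question)                                            # (D,)
        support = torch.stack([encoder(s) for s in support_questions])   # (K, D)
    scores = F.cosine_similarity(q.unsqueeze(0), support, dim=1)         # (K,)
    return support_templates[int(scores.argmax())]
```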
Learning Condensed and Aligned Features for Unsupervised Domain Adaptation Using Label Propagation
Unsupervised domain adaptation, which aims to learn a specific task in one domain using data from another domain, has emerged to address the labeling issue in supervised learning, especially because it is difficult to obtain massive amounts of labeled data in practice. Existing methods have succeeded by
reducing the difference between the embedded features of both domains, but the
performance is still unsatisfactory compared to the supervised learning scheme.
This is attributable to embedded features that lie near each other but neither align perfectly nor form clearly separable clusters. We propose a
novel domain adaptation method based on label propagation and cycle consistency
to let the clusters of features from the two domains overlap exactly and become clearly separated, for high accuracy. Specifically, we introduce cycle consistency to
enforce the relationship between each cluster and exploit label propagation to
achieve the association between the data from the perspective of the manifold
structure instead of a one-to-one relation. Hence, we successfully formed
aligned and discriminative clusters. We present the empirical results of our
method for various domain adaptation scenarios and visualize the embedded
features to prove that our method is critical for better domain adaptation.
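Label propagation spreads source labels to target samples through a similarity graph over the embedded features, associating points via the manifold structure rather than one-to-one pairing. The NumPy sketch below implements the standard iterative propagation rule; the RBF affinity and the mixing coefficient are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def label_propagation(features, labels, num_classes, alpha=0.99, iters=50, sigma=1.0):
    """features: (N, D) embeddings; labels: (N,) integer array with -1 for unlabeled samples."""
    # RBF affinity matrix, symmetrically normalized.
    d2 = np.square(features[:, None] - features[None, :]).sum(-1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(w, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(w.sum(1) + 1e-8)
    s = w * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # One-hot seed labels for the labeled (source) samples.
    y = np.zeros((len(labels), num_classes))
    labeled = labels >= 0
    y[labeled, labels[labeled]] = 1.0

    f = y.copy()
    for _ in range(iters):                 # F <- alpha * S F + (1 - alpha) * Y
        f = alpha * s @ f + (1 - alpha) * y
    return f.argmax(1)                     # propagated label for every sample
```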
How Generative Adversarial Networks and Their Variants Work: An Overview
Generative Adversarial Networks (GAN) have received wide attention in the
machine learning field for their potential to learn high-dimensional, complex real data distributions. Specifically, they do not rely on any assumptions about the distribution and can generate real-like samples from a latent space in a
simple manner. This powerful property has led GANs to be applied to various applications such as image synthesis, image attribute editing, image translation, domain adaptation, and other academic fields. In this paper, we aim
to discuss the details of GANs for readers who are familiar with them but do not comprehend them deeply, or who wish to view GANs from various perspectives. In
addition, we explain how GAN operates and the fundamental meaning of various
objective functions that have been suggested recently. We then focus on how the
GAN can be combined with an autoencoder framework. Finally, we enumerate the
GAN variants that are applied to various tasks and other fields for those who
are interested in exploiting GANs for their research.
Comment: 41 pages, 16 figures, published in ACM Computing Surveys (CSUR)
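As a concrete anchor for the discussion, the original two-player objective (Goodfellow et al., 2014), which the surveyed variants modify or replace, is:

\[
\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
\]

Here the discriminator D is trained to distinguish real samples from generated ones, while the generator G is trained to produce samples that D classifies as real.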